Overview

Dataset statistics

Statistic | Original Dataset | Oversampled Dataset
Number of variables | 9 | 9
Number of observations | 188 | 862
Missing cells | 0 | 0
Missing cells (%) | 0.0% | 0.0%
Duplicate rows | 0 | 176
Duplicate rows (%) | 0.0% | 20.4%
Total size in memory | 13.3 KiB | 67.3 KiB
Average record size in memory | 72.7 B | 80.0 B
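
The percentage and per-record figures in the dataset statistics are simple ratios of the counts reported alongside them. The sketch below recomputes two of them; the helper names are hypothetical, 1 KiB is assumed to be 1024 bytes, and because the reported totals are rounded the record size is only approximate.

```python
def percent(part: int, total: int) -> float:
    """Share of `part` in `total` as a percentage, rounded to one decimal."""
    return round(100.0 * part / total, 1)


def avg_record_size_bytes(total_kib: float, n_rows: int) -> float:
    """Approximate bytes per row from a total size in KiB (assumes 1 KiB = 1024 B)."""
    return total_kib * 1024 / n_rows


# Values taken from the dataset statistics for the Oversampled Dataset.
oversampled_dup_pct = percent(176, 862)
oversampled_record_size = avg_record_size_bytes(67.3, 862)

print(oversampled_dup_pct)
print(round(oversampled_record_size, 1))
```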

Variable types

Type | Original Dataset | Oversampled Dataset
Categorical | 4 | 4
Numeric | 5 | 5

Alerts

Original Dataset | Oversampled Dataset | Alert type
time is highly overall correlated with distance | Alert not present | High Correlation
line_width is highly overall correlated with roughness | Alert not present | High Correlation
roughness is highly overall correlated with line_width | Alert not present | High Correlation
distance is highly overall correlated with time | Alert not present | High Correlation
ink_visco_cp is highly overall correlated with surface_tension_dyne_cm and 1 other fields | Alert not present | High Correlation
surface_tension_dyne_cm is highly overall correlated with ink_visco_cp and 1 other fields | Alert not present | High Correlation
ink _density is highly overall correlated with ink_visco_cp and 1 other fields | Alert not present | High Correlation
overspray has 8 (4.3%) zeros | overspray has 20 (2.3%) zeros | Zeros
Alert not present | Dataset has 176 (20.4%) duplicate rows | Duplicates
Alert not present | distance has a high cardinality: 94 distinct values | High Cardinality
Alert not present | ink_visco_cp has a high cardinality: 180 distinct values | High Cardinality
Alert not present | surface_tension_dyne_cm has a high cardinality: 180 distinct values | High Cardinality
Alert not present | distance is highly imbalanced (59.9%) | Imbalance
Alert not present | ink_visco_cp is highly imbalanced (61.0%) | Imbalance
Alert not present | surface_tension_dyne_cm is highly imbalanced (61.0%) | Imbalance
Alert not present | ink _density is highly imbalanced (60.6%) | Imbalance

Reproduction

Field | Original Dataset | Oversampled Dataset
Analysis started | 2023-04-15 11:10:16.874619 | 2023-04-15 11:10:20.934799
Analysis finished | 2023-04-15 11:10:20.910620 | 2023-04-15 11:10:24.610375
Duration | 4.04 seconds | 3.68 seconds
Software version | ydata-profiling v4.1.2 | ydata-profiling v4.1.2
Download configuration | config.json | config.json

Variables

distance
Categorical

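Statistic | Original Dataset | Oversampled Dataset
Distinct | 3 | 94
Distinct (%) | 1.6% | 10.9%
Missing | 0 | 0
Missing (%) | 0.0% | 0.0%
Memory size | 1.6 KiB | 13.5 KiB

Top values (Original Dataset): 900 (139), 300 (47), 270 (2)
Top values (Oversampled Dataset): 900 (519), 300 (169), 270 (8), 907 (6), 892 (5), other values (89): 155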

Length

Statistic | Original Dataset | Oversampled Dataset
Max length | 3 | 3
Median length | 3 | 3
Mean length | 3 | 3
Min length | 3 | 3

Characters and Unicode

Statistic | Original Dataset | Oversampled Dataset
Total characters | 564 | 2586
Distinct characters | 5 | 10
Distinct categories | 1 | 1
Distinct scripts | 1 | 1
Distinct blocks | 1 | 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.
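
As an illustration of how such character properties can be inspected, the sketch below rebuilds the `distance` values of the Original Dataset from the counts listed under Common Values and tallies characters and Unicode general categories with Python's standard `unicodedata` module. This is not the profiler's own code; note also that `unicodedata` exposes the general category (for example `Nd`, Decimal Number) but not the script or block.

```python
import unicodedata
from collections import Counter

# Value counts for `distance` in the Original Dataset (from Common Values).
values = ["900"] * 139 + ["300"] * 47 + ["270"] * 2

text = "".join(values)
char_counts = Counter(text)
category_counts = Counter(unicodedata.category(ch) for ch in text)

print(len(text))        # total characters
print(char_counts)      # occurrences of each character
print(category_counts)  # general categories; "Nd" means Decimal Number
```

A decimal point, as found in `ink_visco_cp`, falls in the category `Po` (Other Punctuation), which is why that variable reports two distinct categories.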

Unique

Statistic | Original Dataset | Oversampled Dataset
Unique | 0 | 52
Unique (%) | 0.0% | 6.0%

Sample

Row | Original Dataset | Oversampled Dataset
1st row | 270 | 900
2nd row | 270 | 887
3rd row | 300 | 867
4th row | 300 | 900
5th row | 300 | 887

Common Values

Original Dataset
Value | Count | Frequency (%)
900 | 139 | 73.9%
300 | 47 | 25.0%
270 | 2 | 1.1%

Oversampled Dataset
Value | Count | Frequency (%)
900 | 519 | 60.2%
300 | 169 | 19.6%
270 | 8 | 0.9%
907 | 6 | 0.7%
892 | 5 | 0.6%
887 | 5 | 0.6%
908 | 4 | 0.5%
893 | 4 | 0.5%
901 | 4 | 0.5%
891 | 4 | 0.5%
Other values (84) | 134 | 15.5%

Length

Histogram of lengths of the category

Common Values (Plot)

Original Dataset

(Figure: plot of the most common values.)

Oversampled Dataset


Number of variable categories passes threshold (config.plot.cat_freq.max_unique)
Original Dataset
Value | Count | Frequency (%)
900 | 139 | 73.9%
300 | 47 | 25.0%
270 | 2 | 1.1%

Oversampled Dataset
Value | Count | Frequency (%)
900 | 519 | 60.2%
300 | 169 | 19.6%
270 | 8 | 0.9%
907 | 6 | 0.7%
892 | 5 | 0.6%
887 | 5 | 0.6%
898 | 4 | 0.5%
880 | 4 | 0.5%
891 | 4 | 0.5%
901 | 4 | 0.5%
Other values (84) | 134 | 15.5%

Most occurring characters

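Original Dataset
Value | Count | Frequency (%)
0 | 374 | 66.3%
9 | 139 | 24.6%
3 | 47 | 8.3%
2 | 2 | 0.4%
7 | 2 | 0.4%

Oversampled Dataset
Value | Count | Frequency (%)
0 | 1428 | 55.2%
9 | 634 | 24.5%
3 | 198 | 7.7%
8 | 112 | 4.3%
2 | 67 | 2.6%
7 | 51 | 2.0%
1 | 39 | 1.5%
6 | 25 | 1.0%
4 | 17 | 0.7%
5 | 15 | 0.6%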

Most occurring categories

Original Dataset
Value | Count | Frequency (%)
Decimal Number | 564 | 100.0%

Oversampled Dataset
Value | Count | Frequency (%)
Decimal Number | 2586 | 100.0%

Most frequent character per category

Decimal Number: counts and frequencies are identical to those listed under "Most occurring characters", because every character in this variable is a Decimal Number.

Most occurring scripts

Original Dataset
Value | Count | Frequency (%)
Common | 564 | 100.0%

Oversampled Dataset
Value | Count | Frequency (%)
Common | 2586 | 100.0%

Most frequent character per script

Common: counts and frequencies are identical to those listed under "Most occurring characters".

Most occurring blocks

Original Dataset
Value | Count | Frequency (%)
ASCII | 564 | 100.0%

Oversampled Dataset
Value | Count | Frequency (%)
ASCII | 2586 | 100.0%

Most frequent character per block

ASCII: counts and frequencies are identical to those listed under "Most occurring characters".

time
Real number (ℝ)

Statistic | Original Dataset | Oversampled Dataset
Distinct | 63 | 429
Distinct (%) | 33.5% | 49.8%
Missing | 0 | 0
Missing (%) | 0.0% | 0.0%
Infinite | 0 | 0
Infinite (%) | 0.0% | 0.0%
Mean | 71.138298 | 70.227617
Minimum | 31 | 30.523679
Maximum | 130 | 130
Zeros | 0 | 0
Zeros (%) | 0.0% | 0.0%
Negative | 0 | 0
Negative (%) | 0.0% | 0.0%
Memory size | 1.6 KiB | 13.5 KiB

Quantile statistics

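Statistic | Original Dataset | Oversampled Dataset
Minimum | 31 | 30.523679
5-th percentile | 34 | 34
Q1 | 45 | 45.063302
median | 69 | 69
Q3 | 89.25 | 88.451404
95-th percentile | 111.25 | 108
Maximum | 130 | 130
Range | 99 | 99.476321
Interquartile range (IQR) | 44.25 | 43.388102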
Interquartile range (IQR)44.2543.388102

Descriptive statistics

Statistic | Original Dataset | Oversampled Dataset
Standard deviation | 24.68826 | 23.990177
Coefficient of variation (CV) | 0.34704598 | 0.34160603
Kurtosis | -0.82393317 | -0.86230621
Mean | 71.138298 | 70.227617
Median Absolute Deviation (MAD) | 21.5 | 19.584849
Skewness | 0.16926931 | 0.075444651
Sum | 13374 | 60536.206
Variance | 609.51018 | 575.5286
Monotonicity | Not monotonic | Not monotonic
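
Several of the quantile and descriptive statistics are derived from one another: the coefficient of variation is the standard deviation divided by the mean, the variance is the squared standard deviation, the IQR is Q3 minus Q1, the range is maximum minus minimum, and the sum is the mean times the number of observations. The sketch below checks these relations against the reported Original Dataset values for `time`; the function name is hypothetical and the inputs are the rounded figures from the report.

```python
def derived_stats(mean, std, q1, q3, minimum, maximum, n):
    """Recompute statistics that follow directly from other reported values."""
    return {
        "cv": std / mean,            # coefficient of variation
        "variance": std ** 2,
        "iqr": q3 - q1,              # interquartile range
        "range": maximum - minimum,
        "sum": mean * n,
    }


# Reported values for `time` in the Original Dataset (188 observations).
time_original = derived_stats(
    mean=71.138298, std=24.68826, q1=45, q3=89.25,
    minimum=31, maximum=130, n=188,
)

for name, value in time_original.items():
    print(f"{name}: {value:.6f}")
```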
Histogram with fixed size bins (bins=50)
Common values, Original Dataset
Value | Count | Frequency (%)
78 | 9 | 4.8%
44 | 9 | 4.8%
66 | 8 | 4.3%
96 | 8 | 4.3%
38 | 8 | 4.3%
63 | 7 | 3.7%
107 | 6 | 3.2%
61 | 6 | 3.2%
83 | 5 | 2.7%
60 | 5 | 2.7%
Other values (53) | 117 | 62.2%

Common values, Oversampled Dataset
Value | Count | Frequency (%)
44 | 24 | 2.8%
61 | 23 | 2.7%
78 | 22 | 2.6%
34 | 20 | 2.3%
63 | 20 | 2.3%
96 | 20 | 2.3%
38 | 19 | 2.2%
66 | 15 | 1.7%
107 | 15 | 1.7%
83 | 14 | 1.6%
Other values (419) | 670 | 77.7%

Lowest values, counts from the Original Dataset
Value | Count | Frequency (%) of 188 rows | Frequency (%) of 862 rows
31 | 2 | 1.1% | 0.2%
32 | 4 | 2.1% | 0.5%
34 | 5 | 2.7% | 0.6%
35 | 1 | 0.5% | 0.1%
36 | 2 | 1.1% | 0.2%
37 | 2 | 1.1% | 0.2%
38 | 8 | 4.3% | 0.9%
39 | 1 | 0.5% | 0.1%
40 | 4 | 2.1% | 0.5%
41 | 2 | 1.1% | 0.2%

Lowest values, counts from the Oversampled Dataset
Value | Count | Frequency (%) of 188 rows | Frequency (%) of 862 rows
30.52367924 | 1 | 0.5% | 0.1%
30.86781174 | 1 | 0.5% | 0.1%
31 | 6 | 3.2% | 0.7%
31.14408893 | 1 | 0.5% | 0.1%
31.36784489 | 1 | 0.5% | 0.1%
31.67650766 | 1 | 0.5% | 0.1%
32 | 9 | 4.8% | 1.0%
32.05845064 | 1 | 0.5% | 0.1%
32.19108401 | 1 | 0.5% | 0.1%
32.23742243 | 1 | 0.5% | 0.1%

velocity
Real number (ℝ)

Statistic | Original Dataset | Oversampled Dataset
Distinct | 73 | 443
Distinct (%) | 38.8% | 51.4%
Missing | 0 | 0
Missing (%) | 0.0% | 0.0%
Infinite | 0 | 0
Infinite (%) | 0.0% | 0.0%
Mean | 10.462428 | 10.684389
Minimum | 6.667 | 6.667
Maximum | 15.517241 | 15.517241
Zeros | 0 | 0
Zeros (%) | 0.0% | 0.0%
Negative | 0 | 0
Negative (%) | 0.0% | 0.0%
Memory size | 1.6 KiB | 13.5 KiB

Quantile statistics

Statistic | Original Dataset | Oversampled Dataset
Minimum | 6.667 | 6.667
5-th percentile | 6.818 | 6.977
Q1 | 8.27675 | 8.411215
median | 9.945 | 10.227
Q3 | 12.9035 | 13.160968
95-th percentile | 14.9139 | 14.949306
Maximum | 15.517241 | 15.517241
Range | 8.8502414 | 8.8502414
Interquartile range (IQR) | 4.62675 | 4.7497527

Descriptive statistics

Statistic | Original Dataset | Oversampled Dataset
Standard deviation | 2.6390637 | 2.5732038
Coefficient of variation (CV) | 0.25224199 | 0.24083772
Kurtosis | -1.181831 | -1.2321693
Mean | 10.462428 | 10.684389
Median Absolute Deviation (MAD) | 2.05 | 2.119
Skewness | 0.32023116 | 0.26341828
Sum | 1966.9365 | 9209.9432
Variance | 6.9646573 | 6.621378
Monotonicity | Not monotonic | Not monotonic
Histogram with fixed size bins (bins=50)
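
For readers unfamiliar with fixed-size binning, the sketch below shows one common way to split the observed range into 50 equal-width bins and count the values in each. It is an illustrative pure-Python version, not the code used to draw the report's histogram; the function name and the five sample values (taken from the `velocity` tables) are chosen for the example.

```python
def fixed_width_histogram(values, bins=50):
    """Count values in `bins` equal-width intervals spanning [min, max]."""
    lo, hi = min(values), max(values)
    width = (hi - lo) / bins
    counts = [0] * bins
    for v in values:
        # The maximum (and any value when all values are equal) goes in the last bin.
        idx = bins - 1 if width == 0 or v == hi else int((v - lo) / width)
        counts[min(idx, bins - 1)] += 1
    edges = [lo + i * width for i in range(bins + 1)]
    return counts, edges


# A handful of velocity values that appear in the report.
sample = [6.667, 6.818, 9.375, 11.538, 15.517241]
counts, edges = fixed_width_histogram(sample, bins=50)
print(counts)
```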
Common values, Original Dataset
Value | Count | Frequency (%)
9.375 | 12 | 6.4%
6.818 | 9 | 4.8%
11.538 | 9 | 4.8%
13.636 | 6 | 3.2%
7.895 | 6 | 3.2%
14.754 | 6 | 3.2%
10.843 | 5 | 2.7%
15 | 5 | 2.7%
8.411214953 | 5 | 2.7%
10.345 | 5 | 2.7%
Other values (63) | 120 | 63.8%

Common values, Oversampled Dataset
Value | Count | Frequency (%)
9.375 | 29 | 3.4%
6.818 | 24 | 2.8%
14.754 | 23 | 2.7%
11.538 | 22 | 2.6%
10.345 | 14 | 1.6%
7.895 | 14 | 1.6%
10.843 | 14 | 1.6%
15 | 13 | 1.5%
9.677 | 12 | 1.4%
7.5 | 12 | 1.4%
Other values (433) | 685 | 79.5%

Lowest values, counts from the Original Dataset
Value | Count | Frequency (%) of 188 rows | Frequency (%) of 862 rows
6.667 | 4 | 2.1% | 0.5%
6.818 | 9 | 4.8% | 1.0%
6.923 | 2 | 1.1% | 0.2%
6.976744186 | 1 | 0.5% | 0.1%
6.977 | 2 | 1.1% | 0.2%
7.142857143 | 1 | 0.5% | 0.1%
7.143 | 2 | 1.1% | 0.2%
7.317 | 1 | 0.5% | 0.1%
7.317073171 | 1 | 0.5% | 0.1%
7.5 | 4 | 2.1% | 0.5%

Lowest values, counts from the Oversampled Dataset
Value | Count | Frequency (%) of 188 rows | Frequency (%) of 862 rows
6.667 | 6 | 3.2% | 0.7%
6.818 | 24 | 12.8% | 2.8%
6.879096645 | 1 | 0.5% | 0.1%
6.888021761 | 1 | 0.5% | 0.1%
6.892881424 | 1 | 0.5% | 0.1%
6.894332156 | 1 | 0.5% | 0.1%
6.923 | 5 | 2.7% | 0.6%
6.943469374 | 1 | 0.5% | 0.1%
6.948522364 | 1 | 0.5% | 0.1%
6.976744186 | 2 | 1.1% | 0.2%

ink_visco_cp
Categorical

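Statistic | Original Dataset | Oversampled Dataset
Distinct | 2 | 180
Distinct (%) | 1.1% | 20.9%
Missing | 0 | 0
Missing (%) | 0.0% | 0.0%
Memory size | 1.6 KiB | 13.5 KiB

Top values (Original Dataset): 6.9 (140), 6.3 (48)
Top values (Oversampled Dataset): 6.9 (512), 6.3 (172), 6.888283511341732 (1), 6.900157990948587 (1), 6.917195772304838 (1), other values (175): 175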

Length

Statistic | Original Dataset | Oversampled Dataset
Max length | 3 | 18
Median length | 3 | 3
Mean length | 3 | 5.9048724
Min length | 3 | 3

Characters and Unicode

Statistic | Original Dataset | Oversampled Dataset
Total characters | 564 | 5090
Distinct characters | 4 | 11
Distinct categories | 2 | 2
Distinct scripts | 1 | 1
Distinct blocks | 1 | 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Statistic | Original Dataset | Oversampled Dataset
Unique | 0 | 178
Unique (%) | 0.0% | 20.6%

Sample

Row | Original Dataset | Oversampled Dataset
1st row | 6.3 | 6.9
2nd row | 6.3 | 6.899449847960728
3rd row | 6.3 | 6.9138527799555165
4th row | 6.3 | 6.9
5th row | 6.3 | 6.9016898192137734

Common Values

Original Dataset
Value | Count | Frequency (%)
6.9 | 140 | 74.5%
6.3 | 48 | 25.5%

Oversampled Dataset
Value | Count | Frequency (%)
6.9 | 512 | 59.4%
6.3 | 172 | 20.0%
6.888283511341732 | 1 | 0.1%
6.900157990948587 | 1 | 0.1%
6.917195772304838 | 1 | 0.1%
6.9129177749082205 | 1 | 0.1%
6.9002884490082534 | 1 | 0.1%
6.909649130915563 | 1 | 0.1%
6.907051779135069 | 1 | 0.1%
6.915000355459548 | 1 | 0.1%
Other values (170) | 170 | 19.7%

Length

Histogram of lengths of the category

Common Values (Plot)

Original Dataset

(Figure: plot of the most common values.)

Oversampled Dataset


Number of variable categories passes threshold (config.plot.cat_freq.max_unique)
Original Dataset
Value | Count | Frequency (%)
6.9 | 140 | 74.5%
6.3 | 48 | 25.5%

Oversampled Dataset
Value | Count | Frequency (%)
6.9 | 512 | 59.4%
6.3 | 172 | 20.0%
6.477601110555395 | 1 | 0.1%
6.905146894451139 | 1 | 0.1%
6.300508926384501 | 1 | 0.1%
6.89313303437789 | 1 | 0.1%
6.9138527799555165 | 1 | 0.1%
6.9016898192137734 | 1 | 0.1%
6.905067810928166 | 1 | 0.1%
6.90827764879783 | 1 | 0.1%
Other values (170) | 170 | 19.7%

Most occurring characters

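Original Dataset
Value | Count | Frequency (%)
6 | 188 | 33.3%
. | 188 | 33.3%
9 | 140 | 24.8%
3 | 48 | 8.5%

Oversampled Dataset
Value | Count | Frequency (%)
6 | 1107 | 21.7%
. | 862 | 16.9%
9 | 835 | 16.4%
3 | 427 | 8.4%
8 | 298 | 5.9%
0 | 280 | 5.5%
7 | 278 | 5.5%
2 | 262 | 5.1%
5 | 255 | 5.0%
1 | 253 | 5.0%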

Most occurring categories

Original Dataset
Value | Count | Frequency (%)
Decimal Number | 376 | 66.7%
Other Punctuation | 188 | 33.3%

Oversampled Dataset
Value | Count | Frequency (%)
Decimal Number | 4228 | 83.1%
Other Punctuation | 862 | 16.9%

Most frequent character per category

Decimal Number, Original Dataset
Value | Count | Frequency (%)
6 | 188 | 50.0%
9 | 140 | 37.2%
3 | 48 | 12.8%

Decimal Number, Oversampled Dataset
Value | Count | Frequency (%)
6 | 1107 | 26.2%
9 | 835 | 19.7%
3 | 427 | 10.1%
8 | 298 | 7.0%
0 | 280 | 6.6%
7 | 278 | 6.6%
2 | 262 | 6.2%
5 | 255 | 6.0%
1 | 253 | 6.0%
4 | 233 | 5.5%

Other Punctuation
Dataset | Value | Count | Frequency (%)
Original Dataset | . | 188 | 100.0%
Oversampled Dataset | . | 862 | 100.0%

Most occurring scripts

Dataset | Value | Count | Frequency (%)
Original Dataset | Common | 564 | 100.0%
Oversampled Dataset | Common | 5090 | 100.0%

Most frequent character per script

Common: counts and frequencies are identical to those listed under "Most occurring characters".

Most occurring blocks

Dataset | Value | Count | Frequency (%)
Original Dataset | ASCII | 564 | 100.0%
Oversampled Dataset | ASCII | 5090 | 100.0%

Most frequent character per block

ASCII: counts and frequencies are identical to those listed under "Most occurring characters".
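
surface_tension_dyne_cm
Categorical

Statistic | Original Dataset | Oversampled Dataset
Distinct | 2 | 180
Distinct (%) | 1.1% | 20.9%
Missing | 0 | 0
Missing (%) | 0.0% | 0.0%
Memory size | 1.6 KiB | 13.5 KiB

Top values (Original Dataset): 32.3 (140), 30.9 (48)
Top values (Oversampled Dataset): 32.3 (512), 30.9 (172), 32.289824193742916 (1), 32.33416428461478 (1), 32.337839862836155 (1), other values (175): 175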

Length

Statistic | Original Dataset | Oversampled Dataset
Max length | 4 | 18
Median length | 4 | 4
Mean length | 4 | 6.7610209
Min length | 4 | 4

Characters and Unicode

Statistic | Original Dataset | Oversampled Dataset
Total characters | 752 | 5828
Distinct characters | 5 | 11
Distinct categories | 2 | 2
Distinct scripts | 1 | 1
Distinct blocks | 1 | 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Statistic | Original Dataset | Oversampled Dataset
Unique | 0 | 178
Unique (%) | 0.0% | 20.6%

Sample

Row | Original Dataset | Oversampled Dataset
1st row | 30.9 | 32.3
2nd row | 30.9 | 32.35037257232981
3rd row | 30.9 | 32.27699215621301
4th row | 30.9 | 32.3
5th row | 30.9 | 32.242592083460096

Common Values

Original Dataset
Value | Count | Frequency (%)
32.3 | 140 | 74.5%
30.9 | 48 | 25.5%

Oversampled Dataset
Value | Count | Frequency (%)
32.3 | 512 | 59.4%
30.9 | 172 | 20.0%
32.289824193742916 | 1 | 0.1%
32.33416428461478 | 1 | 0.1%
32.337839862836155 | 1 | 0.1%
32.315835398216684 | 1 | 0.1%
32.32454375656937 | 1 | 0.1%
32.253186474102655 | 1 | 0.1%
32.34299183201621 | 1 | 0.1%
32.344197808220876 | 1 | 0.1%
Other values (170) | 170 | 19.7%

Length

Histogram of lengths of the category

Common Values (Plot)

Original Dataset

(Figure: plot of the most common values.)

Oversampled Dataset


Number of variable categories passes threshold (config.plot.cat_freq.max_unique)
Original Dataset
Value | Count | Frequency (%)
32.3 | 140 | 74.5%
30.9 | 48 | 25.5%

Oversampled Dataset
Value | Count | Frequency (%)
32.3 | 512 | 59.4%
30.9 | 172 | 20.0%
31.31440259129592 | 1 | 0.1%
32.24935892512628 | 1 | 0.1%
30.939252092383672 | 1 | 0.1%
32.286937795210164 | 1 | 0.1%
32.27699215621301 | 1 | 0.1%
32.242592083460096 | 1 | 0.1%
32.36271639218698 | 1 | 0.1%
32.28922226040807 | 1 | 0.1%
Other values (170) | 170 | 19.7%

Most occurring characters

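Original Dataset
Value | Count | Frequency (%)
3 | 328 | 43.6%
. | 188 | 25.0%
2 | 140 | 18.6%
0 | 48 | 6.4%
9 | 48 | 6.4%

Oversampled Dataset
Value | Count | Frequency (%)
3 | 1683 | 28.9%
2 | 923 | 15.8%
. | 862 | 14.8%
0 | 448 | 7.7%
9 | 445 | 7.6%
8 | 259 | 4.4%
6 | 257 | 4.4%
4 | 256 | 4.4%
7 | 236 | 4.0%
5 | 231 | 4.0%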

Most occurring categories

Original Dataset
Value | Count | Frequency (%)
Decimal Number | 564 | 75.0%
Other Punctuation | 188 | 25.0%

Oversampled Dataset
Value | Count | Frequency (%)
Decimal Number | 4966 | 85.2%
Other Punctuation | 862 | 14.8%

Most frequent character per category

Decimal Number, Original Dataset
Value | Count | Frequency (%)
3 | 328 | 58.2%
2 | 140 | 24.8%
0 | 48 | 8.5%
9 | 48 | 8.5%

Decimal Number, Oversampled Dataset
Value | Count | Frequency (%)
3 | 1683 | 33.9%
2 | 923 | 18.6%
0 | 448 | 9.0%
9 | 445 | 9.0%
8 | 259 | 5.2%
6 | 257 | 5.2%
4 | 256 | 5.2%
7 | 236 | 4.8%
5 | 231 | 4.7%
1 | 228 | 4.6%

Other Punctuation
Dataset | Value | Count | Frequency (%)
Original Dataset | . | 188 | 100.0%
Oversampled Dataset | . | 862 | 100.0%

Most occurring scripts

Dataset | Value | Count | Frequency (%)
Original Dataset | Common | 752 | 100.0%
Oversampled Dataset | Common | 5828 | 100.0%

Most frequent character per script

Common: counts and frequencies are identical to those listed under "Most occurring characters".

Most occurring blocks

Dataset | Value | Count | Frequency (%)
Original Dataset | ASCII | 752 | 100.0%
Oversampled Dataset | ASCII | 5828 | 100.0%

Most frequent character per block

ASCII: counts and frequencies are identical to those listed under "Most occurring characters".

ink _density
Categorical

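Statistic | Original Dataset | Oversampled Dataset
Distinct | 2 | 48
Distinct (%) | 1.1% | 5.6%
Missing | 0 | 0
Missing (%) | 0.0% | 0.0%
Memory size | 1.6 KiB | 13.5 KiB

Top values (Original Dataset): 1614 (140), 1517 (48)
Top values (Oversampled Dataset): 1614 (530), 1517 (179), 1613 (15), 1612 (14), 1611 (10), other values (43): 114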

Length

Statistic | Original Dataset | Oversampled Dataset
Max length | 4 | 4
Median length | 4 | 4
Mean length | 4 | 4
Min length | 4 | 4

Characters and Unicode

Statistic | Original Dataset | Oversampled Dataset
Total characters | 752 | 3448
Distinct characters | 5 | 10
Distinct categories | 1 | 1
Distinct scripts | 1 | 1
Distinct blocks | 1 | 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Statistic | Original Dataset | Oversampled Dataset
Unique | 0 | 26
Unique (%) | 0.0% | 3.0%

Sample

Row | Original Dataset | Oversampled Dataset
1st row | 1517 | 1614
2nd row | 1517 | 1616
3rd row | 1517 | 1617
4th row | 1517 | 1614
5th row | 1517 | 1612

Common Values

Original Dataset
Value | Count | Frequency (%)
1614 | 140 | 74.5%
1517 | 48 | 25.5%

Oversampled Dataset
Value | Count | Frequency (%)
1614 | 530 | 61.5%
1517 | 179 | 20.8%
1613 | 15 | 1.7%
1612 | 14 | 1.6%
1611 | 10 | 1.2%
1617 | 9 | 1.0%
1512 | 8 | 0.9%
1615 | 8 | 0.9%
1516 | 8 | 0.9%
1610 | 8 | 0.9%
Other values (38) | 73 | 8.5%

Length

Histogram of lengths of the category

Common Values (Plot)

Original Dataset

(Figure: plot of the most common values.)

Oversampled Dataset


Number of variable categories passes threshold (config.plot.cat_freq.max_unique)
Original Dataset
Value | Count | Frequency (%)
1614 | 140 | 74.5%
1517 | 48 | 25.5%

Oversampled Dataset
Value | Count | Frequency (%)
1614 | 530 | 61.5%
1517 | 179 | 20.8%
1613 | 15 | 1.7%
1612 | 14 | 1.6%
1611 | 10 | 1.2%
1617 | 9 | 1.0%
1516 | 8 | 0.9%
1610 | 8 | 0.9%
1615 | 8 | 0.9%
1512 | 8 | 0.9%
Other values (38) | 73 | 8.5%

Most occurring characters

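Original Dataset
Value | Count | Frequency (%)
1 | 376 | 50.0%
6 | 140 | 18.6%
4 | 140 | 18.6%
5 | 48 | 6.4%
7 | 48 | 6.4%

Oversampled Dataset
Value | Count | Frequency (%)
1 | 1700 | 49.3%
6 | 636 | 18.4%
4 | 541 | 15.7%
5 | 268 | 7.8%
7 | 192 | 5.6%
2 | 43 | 1.2%
0 | 23 | 0.7%
3 | 21 | 0.6%
9 | 16 | 0.5%
8 | 8 | 0.2%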

Most occurring categories

Dataset | Value | Count | Frequency (%)
Original Dataset | Decimal Number | 752 | 100.0%
Oversampled Dataset | Decimal Number | 3448 | 100.0%

Most frequent character per category

Decimal Number: counts and frequencies are identical to those listed under "Most occurring characters", because every character in this variable is a Decimal Number.

Most occurring scripts

Dataset | Value | Count | Frequency (%)
Original Dataset | Common | 752 | 100.0%
Oversampled Dataset | Common | 3448 | 100.0%

Most frequent character per script

Common: counts and frequencies are identical to those listed under "Most occurring characters".

Most occurring blocks

Dataset | Value | Count | Frequency (%)
Original Dataset | ASCII | 752 | 100.0%
Oversampled Dataset | ASCII | 3448 | 100.0%

Most frequent character per block

ASCII: counts and frequencies are identical to those listed under "Most occurring characters".

line_width
Real number (ℝ)

Statistic | Original Dataset | Oversampled Dataset
Distinct | 100 | 176
Distinct (%) | 53.2% | 20.4%
Missing | 0 | 0
Missing (%) | 0.0% | 0.0%
Infinite | 0 | 0
Infinite (%) | 0.0% | 0.0%
Mean | 229.3617 | 258.83991
Minimum | 112 | 112
Maximum | 391 | 492
Zeros | 0 | 0
Zeros (%) | 0.0% | 0.0%
Negative | 0 | 0
Negative (%) | 0.0% | 0.0%
Memory size | 1.6 KiB | 13.5 KiB

Quantile statistics

Statistic | Original Dataset | Oversampled Dataset
Minimum | 112 | 112
5-th percentile | 179 | 183
Q1 | 194 | 209.25
median | 222.5 | 254
Q3 | 260 | 296
95-th percentile | 305.65 | 391
Maximum | 391 | 492
Range | 279 | 380
Interquartile range (IQR) | 66 | 86.75

Descriptive statistics

Statistic | Original Dataset | Oversampled Dataset
Standard deviation | 43.833631 | 63.555639
Coefficient of variation (CV) | 0.19111138 | 0.24554034
Kurtosis | 0.33187771 | 1.8029008
Mean | 229.3617 | 258.83991
Median Absolute Deviation (MAD) | 31.5 | 44
Skewness | 0.57550192 | 1.086224
Sum | 43120 | 223120
Variance | 1921.3872 | 4039.3193
Monotonicity | Not monotonic | Not monotonic
Histogram with fixed size bins (bins=50)
Common values, Original Dataset
Value | Count | Frequency (%)
191 | 5 | 2.7%
194 | 4 | 2.1%
183 | 4 | 2.1%
193 | 4 | 2.1%
224 | 4 | 2.1%
203 | 4 | 2.1%
232 | 4 | 2.1%
204 | 4 | 2.1%
207 | 4 | 2.1%
185 | 4 | 2.1%
Other values (90) | 147 | 78.2%

Common values, Oversampled Dataset
Value | Count | Frequency (%)
232 | 14 | 1.6%
321 | 14 | 1.6%
305 | 14 | 1.6%
194 | 13 | 1.5%
218 | 13 | 1.5%
226 | 12 | 1.4%
294 | 11 | 1.3%
206 | 11 | 1.3%
288 | 11 | 1.3%
224 | 11 | 1.3%
Other values (166) | 738 | 85.6%

Lowest values, counts from the Original Dataset
Value | Count | Frequency (%) of 188 rows | Frequency (%) of 862 rows
112 | 1 | 0.5% | 0.1%
123 | 1 | 0.5% | 0.1%
142 | 1 | 0.5% | 0.1%
163 | 1 | 0.5% | 0.1%
167 | 1 | 0.5% | 0.1%
176 | 1 | 0.5% | 0.1%
177 | 1 | 0.5% | 0.1%
178 | 1 | 0.5% | 0.1%
179 | 3 | 1.6% | 0.3%
180 | 2 | 1.1% | 0.2%

Lowest values, counts from the Oversampled Dataset
Value | Count | Frequency (%) of 188 rows | Frequency (%) of 862 rows
112 | 3 | 1.6% | 0.3%
123 | 3 | 1.6% | 0.3%
142 | 3 | 1.6% | 0.3%
163 | 2 | 1.1% | 0.2%
167 | 2 | 1.1% | 0.2%
176 | 3 | 1.6% | 0.3%
177 | 2 | 1.1% | 0.2%
178 | 3 | 1.6% | 0.3%
179 | 7 | 3.7% | 0.8%
180 | 5 | 2.7% | 0.6%

overspray
Real number (ℝ)

Statistic | Original Dataset | Oversampled Dataset
Distinct | 119 | 260
Distinct (%) | 63.3% | 30.2%
Missing | 0 | 0
Missing (%) | 0.0% | 0.0%
Infinite | 0 | 0
Infinite (%) | 0.0% | 0.0%
Mean | 104.83511 | 137.48144
Minimum | 0 | 0
Maximum | 415 | 419
Zeros | 8 | 20
Zeros (%) | 4.3% | 2.3%
Negative | 0 | 0
Negative (%) | 0.0% | 0.0%
Memory size | 1.6 KiB | 13.5 KiB

Quantile statistics

Statistic | Original Dataset | Oversampled Dataset
Minimum | 0 | 0
5-th percentile | 1 | 3
Q1 | 16 | 32
median | 59 | 98.5
Q3 | 169 | 240.75
95-th percentile | 341.6 | 368
Maximum | 415 | 419
Range | 415 | 419
Interquartile range (IQR) | 153 | 208.75

Descriptive statistics

Statistic | Original Dataset | Oversampled Dataset
Standard deviation | 110.08344 | 120.90878
Coefficient of variation (CV) | 1.0500627 | 0.8794553
Kurtosis | 0.28180928 | -0.86166332
Mean | 104.83511 | 137.48144
Median Absolute Deviation (MAD) | 49 | 85.5
Skewness | 1.1385965 | 0.64165618
Sum | 19709 | 118509
Variance | 12118.363 | 14618.933
Monotonicity | Not monotonic | Not monotonic
Histogram with fixed size bins (bins=50)
Common values, Original Dataset
Value | Count | Frequency (%)
0 | 8 | 4.3%
10 | 5 | 2.7%
91 | 5 | 2.7%
3 | 5 | 2.7%
7 | 4 | 2.1%
47 | 4 | 2.1%
32 | 4 | 2.1%
220 | 4 | 2.1%
24 | 3 | 1.6%
5 | 3 | 1.6%
Other values (109) | 143 | 76.1%

Common values, Oversampled Dataset
Value | Count | Frequency (%)
0 | 20 | 2.3%
47 | 15 | 1.7%
32 | 14 | 1.6%
3 | 13 | 1.5%
10 | 13 | 1.5%
11 | 12 | 1.4%
15 | 11 | 1.3%
91 | 11 | 1.3%
201 | 11 | 1.3%
24 | 10 | 1.2%
Other values (250) | 732 | 84.9%

Lowest values, counts from the Original Dataset
Value | Count | Frequency (%) of 188 rows | Frequency (%) of 862 rows
0 | 8 | 4.3% | 0.9%
1 | 3 | 1.6% | 0.3%
2 | 3 | 1.6% | 0.3%
3 | 5 | 2.7% | 0.6%
4 | 1 | 0.5% | 0.1%
5 | 3 | 1.6% | 0.3%
6 | 1 | 0.5% | 0.1%
7 | 4 | 2.1% | 0.5%
8 | 2 | 1.1% | 0.2%
9 | 1 | 0.5% | 0.1%

Lowest values, counts from the Oversampled Dataset
Value | Count | Frequency (%) of 188 rows | Frequency (%) of 862 rows
0 | 20 | 10.6% | 2.3%
1 | 7 | 3.7% | 0.8%
2 | 8 | 4.3% | 0.9%
3 | 13 | 6.9% | 1.5%
4 | 3 | 1.6% | 0.3%
5 | 5 | 2.7% | 0.6%
6 | 7 | 3.7% | 0.8%
7 | 10 | 5.3% | 1.2%
8 | 7 | 3.7% | 0.8%
9 | 3 | 1.6% | 0.3%

roughness
Real number (ℝ)

Statistic | Original Dataset | Oversampled Dataset
Distinct | 94 | 132
Distinct (%) | 50.0% | 15.3%
Missing | 0 | 0
Missing (%) | 0.0% | 0.0%
Infinite | 0 | 0
Infinite (%) | 0.0% | 0.0%
Mean | 98.037234 | 113.33991
Minimum | 43 | 43
Maximum | 192 | 228
Zeros | 0 | 0
Zeros (%) | 0.0% | 0.0%
Negative | 0 | 0
Negative (%) | 0.0% | 0.0%
Memory size | 1.6 KiB | 13.5 KiB

Quantile statistics

Statistic | Original Dataset | Oversampled Dataset
Minimum | 43 | 43
5-th percentile | 58.35 | 63
Q1 | 75 | 82.25
median | 91 | 112
Q3 | 117.25 | 143
95-th percentile | 152.65 | 165
Maximum | 192 | 228
Range | 149 | 185
Interquartile range (IQR) | 42.25 | 60.75

Descriptive statistics

Statistic | Original Dataset | Oversampled Dataset
Standard deviation | 30.766043 | 35.832806
Coefficient of variation (CV) | 0.31381998 | 0.31615348
Kurtosis | 0.11304357 | -0.6006769
Mean | 98.037234 | 113.33991
Median Absolute Deviation (MAD) | 19 | 30.5
Skewness | 0.73342434 | 0.24952404
Sum | 18431 | 97699
Variance | 946.54941 | 1283.99
Monotonicity | Not monotonic | Not monotonic
Histogram with fixed size bins (bins=50)
Common values, Original Dataset
Value | Count | Frequency (%)
77 | 8 | 4.3%
68 | 8 | 4.3%
73 | 6 | 3.2%
99 | 6 | 3.2%
84 | 6 | 3.2%
75 | 5 | 2.7%
85 | 5 | 2.7%
108 | 5 | 2.7%
72 | 5 | 2.7%
117 | 4 | 2.1%
Other values (84) | 130 | 69.1%

Common values, Oversampled Dataset
Value | Count | Frequency (%)
77 | 22 | 2.6%
68 | 21 | 2.4%
84 | 20 | 2.3%
99 | 19 | 2.2%
85 | 18 | 2.1%
144 | 17 | 2.0%
108 | 17 | 2.0%
72 | 17 | 2.0%
75 | 15 | 1.7%
95 | 15 | 1.7%
Other values (122) | 681 | 79.0%

Lowest values, counts from the Original Dataset
Value | Count | Frequency (%) of 188 rows | Frequency (%) of 862 rows
43 | 1 | 0.5% | 0.1%
44 | 1 | 0.5% | 0.1%
45 | 1 | 0.5% | 0.1%
48 | 2 | 1.1% | 0.2%
49 | 2 | 1.1% | 0.2%
54 | 1 | 0.5% | 0.1%
57 | 1 | 0.5% | 0.1%
58 | 1 | 0.5% | 0.1%
59 | 1 | 0.5% | 0.1%
60 | 1 | 0.5% | 0.1%

Lowest values, counts from the Oversampled Dataset
Value | Count | Frequency (%) of 188 rows | Frequency (%) of 862 rows
43 | 3 | 1.6% | 0.3%
44 | 3 | 1.6% | 0.3%
45 | 2 | 1.1% | 0.2%
46 | 1 | 0.5% | 0.1%
47 | 1 | 0.5% | 0.1%
48 | 6 | 3.2% | 0.7%
49 | 5 | 2.7% | 0.6%
54 | 4 | 2.1% | 0.5%
57 | 4 | 2.1% | 0.5%
58 | 2 | 1.1% | 0.2%

Interactions

(Figures: 25 interaction plots for the Original Dataset, each paired with the corresponding plot for the Oversampled Dataset.)

Correlations

(Figure: correlation heatmap.)

Variable | time | velocity | line_width | overspray | roughness | distance | ink_visco_cp | surface_tension_dyne_cm | ink _density
time | 1.000 | 0.023 | -0.042 | -0.067 | -0.122 | 0.687 | 0.275 | 0.275 | 0.275
velocity | 0.023 | 1.000 | 0.300 | 0.062 | 0.136 | 0.482 | 0.278 | 0.278 | 0.278
line_width | -0.042 | 0.300 | 1.000 | 0.290 | 0.619 | 0.000 | 0.000 | 0.000 | 0.000
overspray | -0.067 | 0.062 | 0.290 | 1.000 | 0.229 | 0.000 | 0.198 | 0.198 | 0.198
roughness | -0.122 | 0.136 | 0.619 | 0.229 | 1.000 | 0.202 | 0.271 | 0.271 | 0.271
distance | 0.687 | 0.482 | 0.000 | 0.000 | 0.202 | 1.000 | 0.151 | 0.151 | 0.151
ink_visco_cp | 0.275 | 0.278 | 0.000 | 0.198 | 0.271 | 0.151 | 1.000 | 0.986 | 0.986
surface_tension_dyne_cm | 0.275 | 0.278 | 0.000 | 0.198 | 0.271 | 0.151 | 0.986 | 1.000 | 0.986
ink _density | 0.275 | 0.278 | 0.000 | 0.198 | 0.271 | 0.151 | 0.986 | 0.986 | 1.000

Missing values

Original Dataset

A simple visualization of nullity by column.

Oversampled Dataset

A simple visualization of nullity by column.

Original Dataset

Nullity matrix is a data-dense display which lets you quickly visually pick out patterns in data completion.

Oversampled Dataset

Nullity matrix is a data-dense display which lets you quickly visually pick out patterns in data completion.

Sample

Original Dataset

distance | time | velocity | ink_visco_cp | surface_tension_dyne_cm | ink _density | line_width | overspray | roughness
027034.07.9416.330.9151729412164
127034.07.9416.330.91517261136141
230038.07.8956.330.9151721811103
330044.06.8186.330.915171901568
430041.07.3176.330.915171909190
530040.07.5006.932.31614180062
630038.07.8956.932.316141788082
730043.06.9776.330.9151718524145
830043.06.9776.330.9151721350161
930034.08.8246.330.915173238171

Oversampled Dataset

distance | time | velocity | ink_visco_cp | surface_tension_dyne_cm | ink _density | line_width | overspray | roughness
090069.67498313.3775536.90000032.3000001614434106120
188762.74054014.2448706.89945032.350373161628899118
286762.36563314.3242696.91385332.276992161728798116
390075.14247612.4203416.90000032.300000161443188118
488764.43169314.3115846.90169032.2425921612286113119
588793.9433139.5655376.90506832.362716161728710985
692195.5793109.5135396.90827832.289222161228610282
791692.3290319.6785766.91466532.312416161628510984
890193.8879789.6169806.95075832.294846161329111585
990894.6395589.3530116.90514732.249359161928710284

Original Dataset

distance | time | velocity | ink_visco_cp | surface_tension_dyne_cm | ink _density | line_width | overspray | roughness
178900108.08.3333336.932.3161421228272
17990093.09.6774196.932.3161432347157
18090093.09.6774196.932.31614305201108
18190094.09.5744686.932.3161428810785
18290095.09.4736846.932.3161429024115
18390096.09.3750006.932.316142621794
18490096.09.3750006.932.316142411586
18590096.09.3750006.932.316141917787
186900108.08.3333336.932.31614188173
187900107.08.4112156.932.31614203545

Oversampled Dataset

distance | time | velocity | ink_visco_cp | surface_tension_dyne_cm | ink _density | line_width | overspray | roughness
175900107.08.4112156.932.316141947276
177900108.08.3333336.932.316141918199
178900108.08.3333336.932.3161421228272
17990093.09.6774196.932.3161432347157
18090093.09.6774196.932.31614305201108
18190094.09.5744686.932.3161428810785
18290095.09.4736846.932.3161429024115
18390096.09.3750006.932.316142621794
18490096.09.3750006.932.316142411586
186900108.08.3333336.932.31614188173

Duplicate rows

Original Dataset

distance | time | velocity | ink_visco_cp | surface_tension_dyne_cm | ink _density | line_width | overspray | roughness | # duplicates
Dataset does not contain duplicate rows.

Oversampled Dataset

distance | time | velocity | ink_visco_cp | surface_tension_dyne_cm | ink _density | line_width | overspray | roughness | # duplicates
027034.07.9416.330.915172611361413
127034.07.9416.330.91517294121643
230031.09.6776.330.915172181871173
330031.09.6776.330.91517267801393
430032.09.3756.330.915173061021453
830034.08.8246.330.9151722627953
930034.08.8246.330.9151732381713
1030034.08.8246.932.316142192131853
1230036.08.3336.932.3161414227763
1430037.08.1086.932.316142742831493